- We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $\Sigma \preceq \sigma^2 I_d$ given $n$ samples. When $d = 1$, \cite{catoni} showed an estimator with error $(1+o(1)) \cdot \sigma \sqrt{\frac{2\log\frac{1}{\delta}}{n}}$, with probability $1-\delta$, matching the Gaussian error rate. For $d > 1$, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a $1-\delta$ confidence radius of $\sqrt{\frac{2d}{d+1}} \cdot \sigma \left( \sqrt{\frac{d}{n}} + \sqrt{\frac{2\log\frac{1}{\delta}}{n}} \right)$, incurring a $\sqrt{\frac{2d}{d+1}}$-factor loss over the Gaussian rate. When the $\sqrt{\frac{d}{n}}$ term dominates by a $\sqrt{\log\frac{1}{\delta}}$ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: is the $\sqrt{\frac{2d}{d+1}}$ loss \emph{necessary} when the $\sqrt{\frac{2\log\frac{1}{\delta}}{n}}$ term dominates? We show that the answer is \emph{no}: we construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an $\epsilon$-fraction of samples arbitrarily; in this case, we show that the above strategy of combining one-dimensional estimates and incurring the $\sqrt{\frac{2d}{d+1}}$ factor \emph{is} optimal in the infinite-sample limit.
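As a quick numerical sanity check (not taken from the paper), the sketch below verifies that the plain sample mean of Gaussian data falls within the Gaussian-rate radius $\sigma(\sqrt{d/n} + \sqrt{2\log(1/\delta)/n})$ in well over a $1-\delta$ fraction of trials; the dimension, sample size, and trial count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, sigma, delta = 20, 5000, 1.0, 0.01

# Gaussian-rate confidence radius: sigma * (sqrt(d/n) + sqrt(2*log(1/delta)/n))
radius = sigma * (np.sqrt(d / n) + np.sqrt(2 * np.log(1 / delta) / n))

trials = 200
errs = []
for _ in range(trials):
    X = rng.normal(0.0, sigma, size=(n, d))  # true mean is the origin
    errs.append(np.linalg.norm(X.mean(axis=0)))  # l2 error of the sample mean
errs = np.array(errs)

# Fraction of trials in which the sample mean lands inside the radius;
# this should be at least 1 - delta
coverage = np.mean(errs <= radius)
print(coverage)
```

For heavy-tailed (rather than Gaussian) data the plain sample mean no longer achieves this radius, which is what motivates the estimators discussed in the abstract.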
- Location estimation is one of the most basic questions in parametric statistics. Suppose we have a known distribution density $f$, and we get $n$ i.i.d. samples from $f(x - \mu)$ for some unknown shift $\mu$. The task is to estimate $\mu$ to high accuracy with high probability. The maximum likelihood estimator (MLE) is known to be asymptotically optimal as $n \to \infty$, but what is possible for finite $n$? In this paper, we give two location estimators that are optimal under different criteria: (1) an estimator that has minimax-optimal estimation error subject to succeeding with probability $1-\delta$, and (2) a confidence interval estimator which, subject to its output interval containing $\mu$ with probability at least $1-\delta$, has the minimum expected squared interval width among all shift-invariant estimators. The latter construction can be generalized to minimizing the expectation of any loss function on the interval width.
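As a baseline for the finite-$n$ question above, here is a minimal sketch of the MLE for location: maximize the log-likelihood $\sum_i \log f(x_i - \mu)$ over a grid of candidate shifts. The Laplace density and all parameter values are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Known density (assumed for illustration): standard Laplace, f(x) = 0.5 * exp(-|x|)
def log_f(x):
    return -np.abs(x) - np.log(2.0)

mu_true = 1.7
n = 2000
samples = mu_true + rng.laplace(0.0, 1.0, size=n)  # i.i.d. draws from f(x - mu)

# MLE by grid search: argmax over mu of sum_i log f(x_i - mu)
grid = np.linspace(-5.0, 5.0, 2001)
ll = np.array([log_f(samples - mu).sum() for mu in grid])
mu_hat = grid[np.argmax(ll)]
print(mu_hat)
```

For the Laplace density this MLE coincides with the sample median; the paper's point is that for general $f$ and finite $n$, estimators other than the MLE can be strictly better under the stated criteria.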
- S. Koyejo; S. Mohamed; A. Agarwal; D. Belgrave; K. Cho; A. Oh (Eds.)
- We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yields a near-optimal solution for the underlying robust estimation task. As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks. The obtained algorithms are simple, practical, and succeed under broader distributional assumptions compared to prior work.
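To make the robust sparse mean estimation setting concrete, the sketch below corrupts an $\epsilon$-fraction of samples and runs a simple baseline: coordinate-wise median followed by hard-thresholding to the $k$ largest-magnitude coordinates. This baseline is an illustrative assumption and is not the paper's optimization formulation.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, k, eps = 50, 1000, 5, 0.1

# k-sparse true mean; an eps-fraction of samples is corrupted arbitrarily
mu = np.zeros(d)
mu[:k] = 3.0
X = mu + rng.normal(size=(n, d))
n_bad = int(eps * n)
X[:n_bad] = 100.0  # gross corruptions on the first eps*n rows

# Baseline (not the paper's method): coordinate-wise median, then
# hard-threshold to the k largest-magnitude coordinates
med = np.median(X, axis=0)
keep = np.argsort(-np.abs(med))[:k]
mu_hat = np.zeros(d)
mu_hat[keep] = med[keep]

err = np.linalg.norm(mu_hat - mu)
print(err)
```

The median resists gross corruptions but picks up a bias that grows with $\epsilon$; the paper's formulations aim for near-optimal error via any first-order method that reaches an approximate stationary point.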